Add agent_policy_id and policy_revision_idx to checkin requests #9931
I'm not changing the default behaviour for the agent with regards to acks.
Add the agent_policy_id and policy_revision_idx attributes to checkin requests.
💚 Build Succeeded
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)
Overall this looks good, but I have 2 questions.
Looks good. Thanks for the clarification.
elastic-agent/internal/pkg/agent/application/gateway/fleet/fleet_gateway.go Lines 161 to 176 in ab68480
After fleet checkin, the agent sends all actions through a channel to the dispatcher. They are executed concurrently with the checkin loop; the ticker by default has a 1s duration with up to 500ms jitter. There is no guarantee that the POLICY_CHANGE action is executed before the next checkin. cc @blakerouse
@michel-laterman Thanks for the clarification from the call today. I don't think this should be an issue with this PR, but we might want to make just the policy change blocking, at least until we know it's either applied or not applied. That would really reduce the load on Fleet Server and could be a scale improvement. We could do something like:

```go
ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
defer cancel()
waitForPolicyApply := f.handleActions(actions)
select {
case <-waitForPolicyApply:
case <-ctx.Done():
}
```
Created #10130 to track
@michel-laterman Thanks!
* upstream: (505 commits)
  Update journald tests now that Filebeat supports watching folders (#10131)
  [deploy/kubernetes]: add info about hostPID for Universal Profiling (#10173)
  Fall back to process runtime if otel runtime is unsupported (#10087)
  Conditionall check for ms_tls13kdf build tag (#10160)
  [docs][edot] add entry for profiles (#10163)
  edot/docs: add support for profiles (#10146)
  Add Logstash exporter (#10137)
  Add back publish to serverless. (#10159)
  Improve Integration test documentation (#10155)
  Fix multiarch service image push from main to serverless (#10129)
  Forward migrate action to endpoint (#9801)
  Comment out check for ms_tls13kdf tag for FIPS-capable binaries (#10148)
  [otel] add receivers: apache, iis, mysql, postgresql, sqlserver v0.135.0 (#9344)
  Add k8sevents receiver in kube-stack (#10086)
  feat: emit system resource metrics for EDOT subprocess (#10003)
  [AutoOps] Configure OTel Exporter to Send Maximum-sized Batches (#10126)
  keep enrollment token when replacing data with signed (#10115)
  Revert "Publish `elastic-agent-service` container directly to serverless from main (#9583)" (#10127)
  Add agent_policy_id and policy_revision_idx to checkin requests (#9931)
  remove resource/k8s processor and use k8sattributes processor for service attributes (#10108)
  ...
What does this PR do?
Add the agent_policy_id and policy_revision_idx attributes to checkin requests.
These attributes are sourced from the action stored as part of the state.
Add a feature flag to disable sending acks for policy change actions; behaviour for policy change acks has not been changed with this addition (they are always sent).
Why is it important?
The policy information in fleet-server and agent may go out of sync; this may occur in cases where a VM restores from a snapshot.
Checklist
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
./changelog/fragments using the changelog tool
Disruptive User Impact
N/A
Related issues